Feasibility Study for Ellipsis Resolution in Dialogues by Machine-Learning Technique YAMAMOTO Kazuhide and SUMITA

Author

  • YAMAMOTO Kazuhide
Abstract

A method for resolving the ellipses that appear in Japanese dialogues is proposed. The method resolves not only subject ellipsis but also ellipsis in the object and other grammatical cases. In this approach, a machine-learning algorithm selects the attributes necessary for resolution. A decision tree is built and used as the actual ellipsis resolver. In blind tests, the proposed method achieved a resolution accuracy of 91.7% for indirect objects and 78.7% for subjects with a verb predicate. Investigation of the decision tree shows that topic-dependent attributes are necessary for high-performance resolution, and that the indispensable attributes vary according to the grammatical case. The problem of data size relative to decision-tree training is also discussed.
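The idea of letting a learner pick the attributes and then using the resulting decision tree as the resolver can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the abstract does not name the learning algorithm, so an ID3-style information-gain criterion is swapped in, and the attribute names (`honorific`, `sentence_type`, `speaker_role`), the referent labels, and the seven training rows are invented toy data, not taken from the paper.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def information_gain(rows, labels, attr):
    """Entropy reduction obtained by splitting the examples on `attr`."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attr], []).append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

def build_tree(rows, labels, attrs):
    """ID3-style induction (an assumption; the abstract names no algorithm)."""
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1 or not attrs:
        return majority  # leaf: a referent label
    best = max(attrs, key=lambda a: information_gain(rows, labels, a))
    node = {"attr": best, "default": majority, "children": {}}
    remaining = [a for a in attrs if a != best]
    for value in {row[best] for row in rows}:
        subset = [(r, l) for r, l in zip(rows, labels) if r[best] == value]
        sub_rows, sub_labels = zip(*subset)
        node["children"][value] = build_tree(list(sub_rows), list(sub_labels), remaining)
    return node

def resolve(tree, row):
    """Walk the tree; fall back to the node's majority label on unseen values."""
    while isinstance(tree, dict):
        tree = tree["children"].get(row[tree["attr"]], tree["default"])
    return tree

def used_attributes(tree):
    """Attributes the learner actually selected while building the tree."""
    if not isinstance(tree, dict):
        return set()
    found = {tree["attr"]}
    for child in tree["children"].values():
        found |= used_attributes(child)
    return found

# Hypothetical toy examples: each row describes a sentence with an elided
# subject; the label is the assumed referent of that subject.
ATTRS = ["honorific", "sentence_type", "speaker_role"]
DATA = [
    (("yes", "question", "clerk"), "hearer"),
    (("yes", "statement", "clerk"), "hearer"),
    (("no", "statement", "clerk"), "speaker"),
    (("no", "statement", "customer"), "speaker"),
    (("no", "question", "customer"), "hearer"),
    (("yes", "statement", "customer"), "hearer"),
    (("no", "question", "clerk"), "hearer"),
]
rows = [dict(zip(ATTRS, values)) for values, _ in DATA]
labels = [label for _, label in DATA]

tree = build_tree(rows, labels, ATTRS)
print(sorted(used_attributes(tree)))
print(resolve(tree, {"honorific": "no", "sentence_type": "statement",
                     "speaker_role": "clerk"}))
```

In this toy setting the tree never tests `speaker_role`, which is one way to read "selecting the attributes necessary for a resolution": inspecting which attributes the induced tree uses shows which ones the data actually required.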


Similar resources

Feasibility Study for Ellipsis Resolution in Dialogues by Machine-Learning Technique

A method for resolving the ellipses that appear in Japanese dialogues is proposed. This method resolves not only the subject ellipsis, but also those in object and other grammatical cases. In this approach, a machine-learning algorithm is used to select the attributes necessary for a resolution. A decision tree is built, and used as the actual ellipsis resolver. The results of blind tests have ...


Corpus-Based Anaphora Resolution Towards Antecedent Preference

In this paper we propose a corpus-based approach to anaphora resolution combining a machine learning method and statistical information. First, a decision tree trained on an annotated corpus determines the coreference relation of a given anaphor and antecedent candidates and is utilized as a filter in order to reduce the number of potential candidates. In the second step, preference selection is ...


Multiple Decision-Tree Strategy for Error-Tolerant Ellipsis Resolution

A new approach to robust ellipsis resolution for spoken-language translation is proposed. The strategy consists of a multiple decision-tree (MDT) model and a preference strategy. The proposed MDT model is an extension of the decision-tree model; thus it is flexible, since it is language-independent and task-independent. The preference strategy is a simple but strong preference. We will show that it can...


Development of a Japanese-English Software Manual Parallel Corpus

To address the shortage of Japanese-English parallel corpora, we developed a parallel corpus by collecting open source software manuals from the Web. The constructed corpus contains approximately 500 thousand sentence pairs that were aligned automatically by an existing method. We also conducted statistical machine translation (SMT) experiments with the corpus and confirmed that the corpus is u...


A theme structure method for the ellipsis resolution

The purpose of this paper is to solve the contextual ellipsis problem that is popular in our Chinese spoken dialogue system named EasyNav. A Theme Structure is proposed to describe the attentional state. Its dynamic generation feature makes it suitable to model the topic transition in user-initiative dialogues. By studying the differences and the similarities between the ellipsis and the anapho...




Publication date: 1998